Introduction

Some information about the project would be helpful here.

Analysis

2.1a) Fraction of Xenologs vs. Number of Genes

Short description of the task at hand.

Tab. 1: Dependence of the fraction of xenologs on the size of the gene tree (number of genes). Results are shown for each group; the slope and intercept of a linear model were calculated, as well as the Spearman correlation.
Group Duplication_Rate Loss_Rate HGT_Rate Slope Intercept Spearman_Corr
P0 0.25 0.25 0.25 0.0014 0.23 0.21
P1 0.50 0.50 0.50 -0.0009 0.39 0.05
P2 0.50 0.50 1.00 -0.0012 0.48 -0.02
P3 0.50 0.50 1.50 -0.0021 0.58 -0.16
P4 1.00 1.00 0.50 -0.0011 0.40 0.04
P5 1.00 1.00 1.00 -0.0016 0.53 -0.14
P6 1.50 1.50 1.50 -0.0016 0.57 -0.24

2.1a) Plots: Fraction of Xenologs vs. Number of Genes

The following seven plots show how the Fraction of Xenologs depends on the Number of Genes. For each group (see Tab. 1), a separate plot was created and a linear model was fitted to extract the slope and intercept.
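The per-group fit described above could be computed roughly as follows. This is only a sketch on simulated stand-in data: the column names `Group` and `Number_of_Genes` are assumptions, while `treeDataDf` and `Fraction_of_Xenologs` follow the names used elsewhere in this report.

```r
# Hypothetical stand-in for the simulation results (not the real data).
set.seed(1)
treeDataDf <- data.frame(
  Group = rep(c("P0", "P1"), each = 50),
  Number_of_Genes = sample(10:200, 100, replace = TRUE)
)
treeDataDf$Fraction_of_Xenologs <- pmin(pmax(
  0.3 + 0.001 * treeDataDf$Number_of_Genes + rnorm(100, sd = 0.1), 0), 1)

# Fit one linear model per group and collect slope, intercept and Spearman's rho.
summarise_group <- function(df) {
  fit <- lm(Fraction_of_Xenologs ~ Number_of_Genes, data = df)
  data.frame(
    Slope = unname(coef(fit)["Number_of_Genes"]),
    Intercept = unname(coef(fit)["(Intercept)"]),
    Spearman_Corr = cor(df$Number_of_Genes, df$Fraction_of_Xenologs,
                        method = "spearman")
  )
}

per_group <- lapply(split(treeDataDf, treeDataDf$Group), summarise_group)
result <- cbind(Group = names(per_group), do.call(rbind, per_group))
rownames(result) <- NULL
print(result)
```

The resulting data frame has the same Slope, Intercept and Spearman_Corr columns as Tab. 1, one row per group.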

Fig. 1: Scatterplot of the Fraction of Xenologs against the Number of Genes for group \(P0\), including a linear model (red line) with its confidence interval (grey).

Fig. 2: Scatterplot of the Fraction of Xenologs against the Number of Genes for group \(P1\), including a linear model (red line) with its confidence interval (grey).

Fig. 3: Scatterplot of the Fraction of Xenologs against the Number of Genes for group \(P2\), including a linear model (red line) with its confidence interval (grey).

Fig. 4: Scatterplot of the Fraction of Xenologs against the Number of Genes for group \(P3\), including a linear model (red line) with its confidence interval (grey).

Fig. 5: Scatterplot of the Fraction of Xenologs against the Number of Genes for group \(P4\), including a linear model (red line) with its confidence interval (grey).

Fig. 6: Scatterplot of the Fraction of Xenologs against the Number of Genes for group \(P5\), including a linear model (red line) with its confidence interval (grey).

Fig. 7: Scatterplot of the Fraction of Xenologs against the Number of Genes for group \(P6\), including a linear model (red line) with its confidence interval (grey).

Figure 1 plots the fraction of xenologs of group \(P0\) against the number of genes. The linear model indicates a weak positive relationship: the slope is \(0.0014\), the intercept is \(0.23\), and the Spearman correlation coefficient is \(0.21\). Figures 2 and 3 show a negative slope of the fraction of xenologs with respect to the number of genes. Figure 4 shows the steepest negative slope of all groups (\(-0.0021\)), with a Spearman correlation of \(-0.16\). Figures 5 to 7 also show a negative relationship between the two parameters; group \(P6\) has the strongest negative Spearman correlation (\(-0.24\)).

CONCLUSION????

2.1a) Fraction of Xenologs vs. Number of Species

Each plot needs a short description. This can be done here. Maybe it is better not to iterate over the groups. Maybe we should split the following code into \(7\) separate sections, so we can write a custom text for each.

Tab. 2: Dependence of the fraction of xenologs on the number of species. Results are shown for each group; the slope and intercept of a linear model were calculated, as well as the Spearman correlation.
Group Duplication_Rate Loss_Rate HGT_Rate Slope Intercept Spearman_Corr
P0 0.25 0.25 0.25 0.0015 0.23 0.09
P1 0.50 0.50 0.50 0.0004 0.34 0.04
P2 0.50 0.50 1.00 0.0004 0.42 0.02
P3 0.50 0.50 1.50 -0.0022 0.55 -0.09
P4 1.00 1.00 0.50 0.0002 0.34 0.02
P5 1.00 1.00 1.00 -0.0016 0.49 -0.04
P6 1.50 1.50 1.50 0.0001 0.46 0.00

2.1a) Plots: Fraction of Xenologs vs. Number of Species

During the simulation, each tree was assigned a random maximum number of species between \(10\) and \(50\). The following seven plots show how the Fraction of Xenologs depends on the Number of Species. For each group (see Tab. 2), a separate plot was created and a linear model was fitted to obtain the slope and intercept.
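A scatterplot with a fitted line and its confidence band, as described in the figure captions, can be sketched in base R as shown here. The report does not state how the plots were actually produced, so this is one possible approach on simulated data; the column name `Number_of_Species` and the 95% level are assumptions.

```r
set.seed(2)
# Hypothetical data: species counts drawn from 10..50 as in the simulation setup.
df <- data.frame(Number_of_Species = sample(10:50, 80, replace = TRUE))
df$Fraction_of_Xenologs <- pmin(pmax(0.4 + rnorm(80, sd = 0.1), 0), 1)

fit <- lm(Fraction_of_Xenologs ~ Number_of_Species, data = df)

# Evaluate the fitted line and its 95% confidence band on a regular grid.
grid <- data.frame(Number_of_Species = 10:50)
band <- predict(fit, newdata = grid, interval = "confidence", level = 0.95)

plot(df$Number_of_Species, df$Fraction_of_Xenologs,
     xlab = "Number of Species", ylab = "Fraction of Xenologs", pch = 16)
# Grey band first, then the red regression line on top of it.
polygon(c(grid$Number_of_Species, rev(grid$Number_of_Species)),
        c(band[, "lwr"], rev(band[, "upr"])),
        col = adjustcolor("grey", alpha.f = 0.5), border = NA)
lines(grid$Number_of_Species, band[, "fit"], col = "red", lwd = 2)
```

Running the same code once per group, for example inside a loop over `split(df, df$Group)`, would give one plot per simulation group.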

Fig. 8: Scatterplot of the Fraction of Xenologs against the Number of Species for group \(P0\), including a linear model (red line) with its confidence interval (grey).

Fig. 9: Scatterplot of the Fraction of Xenologs against the Number of Species for group \(P1\), including a linear model (red line) with its confidence interval (grey).

Fig. 10: Scatterplot of the Fraction of Xenologs against the Number of Species for group \(P2\), including a linear model (red line) with its confidence interval (grey).

Fig. 11: Scatterplot of the Fraction of Xenologs against the Number of Species for group \(P3\), including a linear model (red line) with its confidence interval (grey).

Fig. 12: Scatterplot of the Fraction of Xenologs against the Number of Species for group \(P4\), including a linear model (red line) with its confidence interval (grey).

Fig. 13: Scatterplot of the Fraction of Xenologs against the Number of Species for group \(P5\), including a linear model (red line) with its confidence interval (grey).

Fig. 14: Scatterplot of the Fraction of Xenologs against the Number of Species for group \(P6\), including a linear model (red line) with its confidence interval (grey).

All groups except \(P3\) and \(P5\) show a positive slope of the Fraction of Xenologs with respect to the Number of Species, although for \(P6\) the relationship is essentially flat (Spearman correlation \(0.00\)). The results for each group are listed in Tab. 2. The Spearman correlation coefficients range from \(-0.09\) to \(0.09\).

CONCLUSION:

2.1b) Fraction of Xenologs with a fixed HGT rate

As shown in Tab. 1 and Tab. 2, the duplication rate and the loss rate are equal within every simulation group, so they always increase together. We therefore combine the duplication and loss rate into a single factor.
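The combination of the two rates can be checked directly against the group definitions in Tab. 1. The sketch below copies the rates from that table; the column name `DL_Rate` for the combined factor is a hypothetical choice.

```r
# Simulation groups and rates copied from Tab. 1.
groups <- data.frame(
  Group = paste0("P", 0:6),
  Duplication_Rate = c(0.25, 0.50, 0.50, 0.50, 1.00, 1.00, 1.50),
  Loss_Rate        = c(0.25, 0.50, 0.50, 0.50, 1.00, 1.00, 1.50),
  HGT_Rate         = c(0.25, 0.50, 1.00, 1.50, 0.50, 1.00, 1.50)
)

# The two rates coincide in every group, so one factor is enough.
stopifnot(all(groups$Duplication_Rate == groups$Loss_Rate))
groups$DL_Rate <- factor(groups$Duplication_Rate)  # hypothetical column name

# Groups that share an HGT rate can be compared across DL_Rate levels.
same_hgt <- split(groups$Group, groups$HGT_Rate)
print(same_hgt)
```

According to Tab. 1, the groups sharing an HGT rate are \(P1\) and \(P4\) (rate 0.50), \(P2\) and \(P5\) (rate 1.00), and \(P3\) and \(P6\) (rate 1.50); \(P0\) has no partner.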

Plot: Duplication / Loss Rate

Short introduction: the loss and duplication rates are always the same, hence just one figure.

Fig. 15: Boxplot of the Fraction of Xenologs plotted against the duplication rate with a fixed horizontal gene transfer (HGT) rate. The different colors mark the groups with the same HGT rate.

In XX we see examples of plotting in R. Explanation of the graph

2.1c

How does the fraction depend on the horizontal transfer rate and on the rate of duplications and losses?

How does the fraction depend on the horizontal transfer rate with a fixed duplication and loss rate? --> corrected question??

2.1c) Plots: Fraction of Xenologs vs. HGT Rate with a fixed Duplication and Loss Rate

Boxplot of the Fraction of Xenologs plotted against the HGT rate with a fixed duplication and loss rate. The different colors mark the groups with the same duplication or loss rate.

2.1d

How does the fraction depend on the frequency of multifurcations?

2.1d Plot

I do not like this plot at all either. A boxplot over all groups? Or 6 separate plots, one per group, with a linear or x^2 model? Or we form "buckets" with x = non_binary_prob vs. y = Fraction_of_Xenologs, with buckets 1 = [0-0.1], 2 = [0.1-0.2], 3 = [0.2-0.3], and so on.

caption = "What is shown here??"

plot(x = treeDataDf$non_binary_prob,
     y = treeDataDf$Fraction_of_Xenologs)
What is shown here??
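The bucket idea from the note above could look like the following sketch. It uses simulated stand-in data with the two column names from the `plot` call; the simulated downward trend is arbitrary and says nothing about the real results.

```r
set.seed(3)
# Hypothetical stand-in for treeDataDf with the two columns used in the plot call.
treeDataDf <- data.frame(non_binary_prob = runif(200))
treeDataDf$Fraction_of_Xenologs <- pmin(pmax(
  0.5 - 0.2 * treeDataDf$non_binary_prob + rnorm(200, sd = 0.1), 0), 1)

# Ten buckets of width 0.1; include.lowest keeps a value of exactly 0 in the first one.
treeDataDf$bucket <- cut(treeDataDf$non_binary_prob,
                         breaks = seq(0, 1, by = 0.1),
                         include.lowest = TRUE)

# One box per bucket instead of a raw scatterplot.
boxplot(Fraction_of_Xenologs ~ bucket, data = treeDataDf,
        xlab = "non_binary_prob (bucket)", ylab = "Fraction of Xenologs",
        las = 2)

# Per-bucket medians as a compact numeric summary.
bucket_medians <- tapply(treeDataDf$Fraction_of_Xenologs, treeDataDf$bucket, median)
```

With `cut`, each interval is closed on the right by default, so a value of exactly 0.1 falls into the first bucket rather than the second.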

2.2 Fitch from LDT with CD

Second we consider the dependencies for the edges in Fitch graphs computed from an LDT graph. Here the following variants should be considered:

  • Complete multipartite graph obtained by solving the Cluster Deletion Problem for the complement of the LDT (see webpage).
  • The rs-Fitch graph of the scenario computed with “Algorithm 1” from Rbelow.pdf (the latter is already implemented in AsymmeTree).

@Paul, didn't you already start something on this?

I cannot find anything about it in my script.

Plots

They can be inserted here.

3. Triples: Characterization of LDT Graph

The triple set \(T (G)\) is related to the gene tree, while the triple set \(S(G, σ)\) is related to the species tree. It is therefore of interest to compare to what extent \(T (G)\) and \(S(G, σ)\) overlap with the triple set of the true gene tree and the triple set of the true species tree, respectively. How can this be quantified in a meaningful way? Again, we are interested in the dependence on the simulation parameters.
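One possible way to quantify the overlap is to treat both triple sets as plain sets and report precision and recall. The sketch below assumes that a rooted triple xy|z is encoded as a string with the two-leaf side sorted; the helper names `triple_key` and `triple_overlap` and the leaf labels are hypothetical.

```r
# Canonical string for a rooted triple xy|z: the two-leaf side is unordered.
triple_key <- function(x, y, z) paste0(pmin(x, y), ",", pmax(x, y), "|", z)

# Precision: share of inferred triples that are in the reference set.
# Recall: share of reference triples that were recovered.
triple_overlap <- function(inferred, reference) {
  shared <- length(intersect(inferred, reference))
  c(precision = shared / length(inferred),
    recall    = shared / length(reference))
}

# Toy example: the reference set holds the four triples of the tree (((a,b),c),d).
reference <- triple_key(c("a", "a", "b", "a"),
                        c("b", "b", "c", "c"),
                        c("c", "d", "d", "d"))
inferred  <- triple_key(c("b", "a"), c("a", "c"), c("c", "d"))
res <- triple_overlap(inferred, reference)
print(res)
```

In this toy case both inferred triples are correct (precision 1), but only two of the four reference triples are found (recall 0.5).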

Tab. 3: Some Caption
Group recall_cd_mean_100 recall_cd_mean_80 recall_cd_mean_60 recall_cd_mean_40 recall_cd_mean_20
P0 0.67 0.66 0.65 0.61 0.50
P1 0.67 0.66 0.64 0.59 0.51
P2 0.67 0.66 0.66 0.63 0.56
P3 0.70 0.70 0.68 0.65 0.58
P4 0.64 0.63 0.63 0.58 0.49
P5 0.69 0.69 0.68 0.65 0.56
P6 0.69 0.68 0.67 0.64 0.61
Tab. 4: Some Caption
Group precision_cd_mean_100 precision_cd_mean_80 precision_cd_mean_60 precision_cd_mean_40 precision_cd_mean_20
P0 0.85 0.86 0.89 0.88 0.87
P1 0.88 0.88 0.91 0.91 0.90
P2 0.91 0.92 0.93 0.93 0.94
P3 0.94 0.95 0.95 0.96 0.96
P4 0.86 0.87 0.88 0.88 0.89
P5 0.93 0.93 0.94 0.94 0.93
P6 0.95 0.95 0.95 0.95 0.97
Tab. 5: Some Caption
Group accuracy_cd_mean_100 accuracy_cd_mean_80 accuracy_cd_mean_60 accuracy_cd_mean_40 accuracy_cd_mean_20
P0 0.97 0.98 0.98 0.99 1.00
P1 0.95 0.96 0.98 0.99 1.00
P2 0.93 0.95 0.97 0.98 1.00
P3 0.92 0.94 0.96 0.98 0.99
P4 0.94 0.96 0.97 0.99 1.00
P5 0.93 0.95 0.97 0.98 1.00
P6 0.92 0.94 0.96 0.98 0.99
Tab. 6: Some Caption
Group recall_rs_mean_100 recall_rs_mean_80 recall_rs_mean_60 recall_rs_mean_40 recall_rs_mean_20
P0 0.70 0.67 0.65 0.62 0.51
P1 0.71 0.69 0.65 0.59 0.53
P2 0.73 0.70 0.69 0.66 0.58
P3 0.75 0.74 0.72 0.67 0.60
P4 0.70 0.66 0.66 0.60 0.52
P5 0.74 0.73 0.71 0.67 0.58
P6 0.75 0.74 0.72 0.68 0.64
Tab. 7: Some Caption
Group precision_rs_mean_100 precision_rs_mean_80 precision_rs_mean_60 precision_rs_mean_40 precision_rs_mean_20
P0 0.83 0.83 0.84 0.86 0.85
P1 0.85 0.85 0.86 0.88 0.89
P2 0.91 0.89 0.91 0.91 0.93
P3 0.92 0.93 0.93 0.94 0.95
P4 0.86 0.84 0.86 0.84 0.90
P5 0.91 0.91 0.91 0.92 0.92
P6 0.94 0.93 0.94 0.94 0.95
Tab. 8: Some Caption
Group accuracy_rs_mean_100 accuracy_rs_mean_80 accuracy_rs_mean_60 accuracy_rs_mean_40 accuracy_rs_mean_20
P0 0.97 0.97 0.98 0.99 1.00
P1 0.95 0.96 0.97 0.99 1.00
P2 0.94 0.95 0.97 0.98 1.00
P3 0.93 0.95 0.97 0.98 0.99
P4 0.95 0.96 0.97 0.99 1.00
P5 0.94 0.95 0.97 0.98 1.00
P6 0.93 0.95 0.96 0.98 0.99
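For reference, recall, precision and accuracy of an inferred graph against a true graph can be computed from the two edge sets. The sketch below assumes that edges are compared as unordered pairs and that every pair of distinct vertices counts as a possible edge; whether the tables above were produced this way is not stated in this report, and the function names are hypothetical.

```r
# Canonical key for an unordered pair of vertex labels.
edge_key <- function(a, b) paste(pmin(a, b), pmax(a, b), sep = "|")

edge_metrics <- function(true_edges, inferred_edges, vertices) {
  true_k <- unique(edge_key(true_edges[, 1], true_edges[, 2]))
  inf_k  <- unique(edge_key(inferred_edges[, 1], inferred_edges[, 2]))
  n_pairs <- choose(length(vertices), 2)   # all possible unordered pairs
  tp <- length(intersect(true_k, inf_k))   # edges present in both graphs
  fp <- length(setdiff(inf_k, true_k))     # inferred but not true
  fn <- length(setdiff(true_k, inf_k))     # true but not inferred
  tn <- n_pairs - tp - fp - fn             # absent in both graphs
  c(recall = tp / (tp + fn),
    precision = tp / (tp + fp),
    accuracy = (tp + tn) / n_pairs)
}

# Toy example with four vertices and three edges per graph.
vertices <- c("a", "b", "c", "d")
true_edges     <- rbind(c("a", "b"), c("a", "c"), c("b", "d"))
inferred_edges <- rbind(c("b", "a"), c("a", "c"), c("c", "d"))
m <- edge_metrics(true_edges, inferred_edges, vertices)
print(m)
```

Because accuracy also counts true negatives, it can stay close to 1 on sparse graphs even when recall is moderate.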

Plots Cluster Deletion

Meaningful description of the plot here.

Meaningful description of the plot here.

Meaningful description of the plot here.

Plots Fitch (RS)

Meaningful description of the plot here.

Meaningful description of the plot here.

Meaningful description of the plot here.

Triple T Plots

Something is not quite right with the data here. There are \(1\)s everywhere.

Recall

Descriptive text here.

Summary Table Recall

Tab. 9: Recall
Group T_LDT_Recall_Mean T_LDT_Recall_Median S_LDT_Recall_Mean S_LDT_Recall_Median
P0 1 1 1 1
P1 1 1 1 1
P2 1 1 1 1
P3 1 1 1 1
P4 1 1 1 1
P5 1 1 1 1
P6 1 1 1 1

Precision

Descriptive text here.

Precision Summary

Tab. 10: Precision
Group T_LDT_Precision_Mean T_LDT_Precision_Median S_LDT_Precision_Mean S_LDT_Precision_Median
P0 0.05 0.01 0.07 0.00
P1 0.09 0.04 0.10 0.04
P2 0.13 0.09 0.18 0.11
P3 0.14 0.10 0.23 0.18
P4 0.09 0.04 0.10 0.04
P5 0.12 0.08 0.17 0.11
P6 0.14 0.10 0.20 0.13

Accuracy Plot

Accuracy: descriptive text here.

Accuracy Summary

Tab. 6: Precision
Group T_LDT_Precision_Mean T_LDT_Precision_Median S_LDT_Precision_Mean S_LDT_Precision_Median
P0 0.05 0.01 0.07 0.00
P1 0.09 0.04 0.10 0.04
P2 0.13 0.09 0.18 0.11
P3 0.14 0.10 0.23 0.18
P4 0.09 0.04 0.10 0.04
P5 0.12 0.08 0.17 0.11
P6 0.14 0.10 0.20 0.13

Triple S

Recall

Recall: descriptive text here.

Precision

Precision: descriptive text here.

Accuracy

Accuracy: descriptive text here.

Triple T Fraction

Triple T Fraction: descriptive text here.

Triple S Fraction

Triple S Fraction: descriptive text here.